Regularizing Heteroschedastic Discriminant Analysis for Speech Recognition (Konuşma Tanıma İçin Heteroskedastik Ayırtaç Analizinin Düzenlileştirilmesi)
Abstract
Linear Discriminant Analysis (LDA) followed by a diagonalizing maximum likelihood linear transform (MLLT), applied to spliced static MFCC features, yields substantial performance gains over MFCC+dynamic features in most speech recognition tasks. It is reasonable to regularize the LDA transform computation for stability. In this paper, we regularize LDA and heteroschedastic LDA transforms using two methods: (1) statistical priors for estimating the transform and (2) structural constraints on the transform. Our structural constraint imposes a block-structured LDA transform in which each block acts on the same cepstral parameter across frames. This second approach yields new coefficients for the static, first-difference, and second-difference operators in place of the standard ones. We test the new algorithms on two tasks, TIMIT and AURORA2, and obtain consistent improvement over standard MFCC features. We also improve upon LDA+MLLT features at certain noise levels in the AURORA2 tests.
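The block-structure idea can be illustrated with a minimal NumPy sketch. It assumes the constrained transform has one small filter bank per cepstral coefficient, applied across a window of spliced frames, so that the usual static, delta, and delta-delta features are one particular choice of filters. The function names (`standard_filters`, `block_transform`, `splice`), the 9-frame window, and the 13-coefficient input are hypothetical choices for illustration. The filter values are the commonly used regression weights over ±2 frames, not the coefficients learned in the paper.

```python
import numpy as np

def standard_filters():
    """Static, delta, and delta-delta weights over a 9-frame window.

    Assumption: delta uses regression weights over +/-2 frames and
    delta-delta is the same delta operator applied to the deltas.
    """
    w = np.array([-2.0, -1.0, 0.0, 1.0, 2.0]) / 10.0
    static = np.zeros(9)
    static[4] = 1.0                # picks the centre frame
    delta = np.zeros(9)
    delta[2:7] = w                 # offsets -2..+2 around the centre
    ddelta = np.convolve(w, w)     # delta of delta, offsets -4..+4
    return np.stack([static, delta, ddelta])   # shape (3, 9)

def block_transform(filters):
    """Build a block-structured matrix from per-coefficient filters.

    filters: (D, R, W) array, one (R, W) filter bank per cepstral index.
    Returns A of shape (R*D, W*D) acting on frame-major spliced vectors,
    where output r*D+d reads only inputs k*D+d (coefficient d across frames).
    """
    D, R, W = filters.shape
    A = np.zeros((R * D, W * D))
    for d in range(D):
        A[d::D, d::D] = filters[d]
    return A

def splice(cepstra, width):
    """Stack `width` neighbouring frames per time step (edges repeated)."""
    T, _ = cepstra.shape
    half = width // 2
    padded = np.pad(cepstra, ((half, half), (0, 0)), mode="edge")
    return np.stack([padded[t:t + width].reshape(-1) for t in range(T)])

# Toy data standing in for 20 frames of 13 static MFCCs.
rng = np.random.default_rng(0)
cepstra = rng.standard_normal((20, 13))

# Same filters for every cepstral index reproduces the standard features;
# a learned transform could use different filters per index.
filters = np.broadcast_to(standard_filters(), (13, 3, 9)).copy()
A = block_transform(filters)
features = splice(cepstra, 9) @ A.T    # columns: static, delta, delta-delta
```

Under these assumptions, replacing the entries of `filters` with estimated values changes the static, first-difference, and second-difference operators while keeping the matrix sparse: only `D * R * W` entries are free, instead of the `R*D * W*D` entries of an unconstrained transform on the spliced vector.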
Similar resources
Regularizing linear discriminant analysis for speech recognition
Feature extraction is an essential first step in speech recognition applications. In addition to static features extracted from each frame of speech data, it is beneficial to use dynamic features (called ∆ and ∆∆ coefficients) that use information from neighboring frames. Linear Discriminant Analysis (LDA) followed by a diagonalizing maximum likelihood linear transform (MLLT) applied to spliced...
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
Subspace Kernel Discriminant Analysis for Speech Recognition
Kernel Discriminant Analysis (KDA) has been successfully applied to many pattern recognition problems. KDA transforms the original problem into a space of dimension N where N is the number of training vectors. For speech recognition, N is usually prohibitively high increasing computational requirements beyond current computational capabilities. In this paper, we provide a formulation of a subsp...